PC algorithm
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- North America > Canada > Quebec > Montreal (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Health & Medicine > Therapeutic Area > Oncology (1.00)
- Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
- North America > United States (0.04)
- Asia > India (0.04)
A Supplement
Here we provide proofs of the statements made in the main text, further figures from the numerical experiments, and a more detailed discussion of how heteroskedasticity affects causal discovery. [...] Z. Testing whether the Pearson correlation between X and Y is zero is equivalent to testing whether the slope parameter β is equal to zero. Therefore, this is a homoskedastic problem. A.1.2 Discussion of Effect 2: We start by discussing the homoskedastic case to see where non-constant noise variance leads to problems within the t-test. For homoskedastic noise, the second factor is an estimator of the standard error of β̂, which we derive by using the mean of the squared residuals as an estimator for the error variance.
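The equivalence stated in this excerpt, between testing for zero Pearson correlation and testing for a zero regression slope, can be checked numerically. The following is a minimal sketch, not the authors' code: the function names are hypothetical, the regression is assumed to be simple OLS with an intercept, and the error variance is assumed to be estimated by the residual sum of squares divided by n - 2 (the usual degrees-of-freedom convention, which the excerpt does not specify).

```python
import math
import random


def _centered_sums(x, y):
    """Return (sxx, syy, sxy, mean_x, mean_y) for two equal-length samples."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxx, syy, sxy, mean_x, mean_y


def slope_t_statistic(x, y):
    """t-statistic for H0: beta = 0 in y = alpha + beta * x + noise.

    Assumes homoskedastic noise; the error variance is estimated as
    RSS / (n - 2), which is an assumption of this sketch.
    """
    n = len(x)
    sxx, _, sxy, mean_x, mean_y = _centered_sums(x, y)
    beta_hat = sxy / sxx
    alpha_hat = mean_y - beta_hat * mean_x
    rss = sum((yi - alpha_hat - beta_hat * xi) ** 2 for xi, yi in zip(x, y))
    sigma2_hat = rss / (n - 2)
    standard_error = math.sqrt(sigma2_hat / sxx)
    return beta_hat / standard_error


def correlation_t_statistic(x, y):
    """t-statistic for H0: Pearson correlation between x and y is zero."""
    n = len(x)
    sxx, syy, sxy, _, _ = _centered_sums(x, y)
    r = sxy / math.sqrt(sxx * syy)
    return r * math.sqrt((n - 2) / (1.0 - r * r))


if __name__ == "__main__":
    rng = random.Random(0)
    x = [rng.gauss(0.0, 1.0) for _ in range(200)]
    y = [0.5 * xi + rng.gauss(0.0, 1.0) for xi in x]
    # The two statistics agree up to floating-point rounding.
    print(slope_t_statistic(x, y), correlation_t_statistic(x, y))
```

Under heteroskedastic noise the two statistics still coincide algebraically, but the standard-error estimate in `slope_t_statistic` no longer matches the true sampling variability of the slope estimate, which is the kind of problem the excerpt goes on to discuss.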
- Europe > Germany > Berlin (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > Spain > Canary Islands (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
- North America > United States > Maryland > Baltimore (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- Asia > China > Hong Kong (0.04)
Realizing LLMs' Causal Potential Requires Science-Grounded, Novel Benchmarks
Srivastava, Ashutosh, Nagalapatti, Lokesh, Jajoo, Gautam, Vashishtha, Aniket, Krishnamurthy, Parameswari, Sharma, Amit
Recent claims of strong performance by Large Language Models (LLMs) on causal discovery are undermined by a key flaw: many evaluations rely on benchmarks likely included in pretraining corpora. Thus, apparent success suggests that LLM-only methods, which ignore observational data, outperform classical statistical approaches. We challenge this narrative by asking: Do LLMs truly reason about causal structure, and how can we measure it without memorization concerns? Can they be trusted for real-world scientific discovery? We argue that realizing LLMs' potential for causal analysis requires two shifts: (P.1) developing robust evaluation protocols based on recent scientific studies to guard against dataset leakage, and (P.2) designing hybrid methods that combine LLM-derived knowledge with data-driven statistics. To address P.1, we encourage evaluating discovery methods on novel, real-world scientific studies. We outline a practical recipe for extracting causal graphs from recent publications released after an LLM's training cutoff, ensuring relevance and preventing memorization while capturing both established and novel relations. Compared to benchmarks like BNLearn, where LLMs achieve near-perfect accuracy, they perform far worse on our curated graphs, underscoring the need for statistical grounding. Supporting P.2, we show that using LLM predictions as priors for the classical PC algorithm significantly improves accuracy over both LLM-only and purely statistical methods. We call on the community to adopt science-grounded, leakage-resistant benchmarks and invest in hybrid causal discovery methods suited to real-world inquiry.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- North America > United States > Virginia > Arlington County > Arlington (0.04)
- North America > United States > Illinois (0.04)
Causal Explanation of Concept Drift -- A Truly Actionable Approach
Komnick, David, Lammers, Kathrin, Hammer, Barbara, Vaquet, Valerie, Hinder, Fabian
In a world that constantly changes, it is crucial to understand how those changes impact different systems, such as industrial manufacturing or critical infrastructure. Explaining critical changes, referred to as concept drift in the field of machine learning, is the first step towards enabling targeted interventions to avoid or correct model failures, as well as malfunctions and errors in the physical world. Therefore, in this work, we extend model-based drift explanations towards causal explanations, which increases the actionability of the provided explanations. We evaluate our explanation strategy on a number of use cases, demonstrating the practical usefulness of our framework, which isolates the causally relevant features impacted by concept drift and, thus, allows for targeted intervention.
- North America > United States > Wisconsin > Dane County > Madison (0.04)
- North America > Canada > Ontario > Toronto (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Germany (0.04)
- North America > United States > California > Alameda County > Berkeley (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- North America > Canada (0.04)
- Asia > China > Hong Kong (0.04)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine (1.00)
We thank reviewers for their constructive comments, please see below for our response
We thank the reviewers for their constructive comments; please see below for our response. We will make this clear in the revised version. We will include the new results in the revision. Reviewer #2-1: Why SVT suffers from low accuracy. PC's original privacy guarantee might not hold because the sensitivity of the utility score calculated with greedy search [...] We will make the statement clearer in the revision.